Regularization Path of Cross-Validation Error Lower Bounds
Atsushi Shibagaki, Yoshiki Suzuki, Masayuki Karasuyama, Ichiro Takeuchi
Careful tuning of a regularization parameter is indispensable in many machine learning tasks because it has a significant impact on generalization performance. Nevertheless, the current practice of regularization parameter tuning is more of an art than a science, e.g., it is hard to tell how many grid points would be needed in cross-validation (CV) for obtaining a solution with sufficiently small CV error. In this paper we propose a novel framework for computing a lower bound of the CV errors as a function of the regularization parameter, which we call the regularization path of CV error lower bounds. The proposed framework can be used for providing a theoretical approximation guarantee on a set of solutions, in the sense that it bounds how far the CV error of the current best solution can be from the best possible CV error over the entire range of regularization parameters. Our numerical experiments demonstrate that a theoretically guaranteed choice of a regularization parameter in the above sense is possible at reasonable computational cost.
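The grid-point question raised in the abstract can be made concrete with a minimal sketch of plain grid-search CV (this is not the paper's lower-bound machinery, and the ridge/MSE setup, data, and function names here are illustrative assumptions): the resolution of the grid is an unguided choice, with no guarantee on how far the best grid point's CV error is from the best achievable one.

```python
import numpy as np

def kfold_cv_error(X, y, lam, k=5, seed=0):
    """Mean validation MSE of ridge regression over k folds."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, k)
    errs = []
    for i in range(k):
        val = folds[i]
        trn = np.concatenate([folds[j] for j in range(k) if j != i])
        Xt, yt = X[trn], y[trn]
        # closed-form ridge solution: (X^T X + lam I)^{-1} X^T y
        w = np.linalg.solve(Xt.T @ Xt + lam * np.eye(X.shape[1]), Xt.T @ yt)
        errs.append(np.mean((X[val] @ w - y[val]) ** 2))
    return float(np.mean(errs))

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 10))
y = X @ rng.standard_normal(10) + 0.5 * rng.standard_normal(100)

grid = np.logspace(-3, 3, 20)  # how fine should this grid be? CV alone cannot say
cv = [kfold_cv_error(X, y, lam) for lam in grid]
best = grid[int(np.argmin(cv))]
print(f"best lambda on this grid: {best:.4g}, CV error: {min(cv):.4g}")
```

The framework described in the abstract addresses exactly what this sketch lacks: a bound certifying how suboptimal `best` can be relative to the whole continuous range of the regularization parameter.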
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Japan (0.04)
Author Feedback
Thank you for the fruitful comments. We would like to rebut this criticism with the following two points. We would like to emphasize that this restriction is identical to assuming that the loss function is strongly convex. A huge body of theoretical work on convex empirical risk minimization has been devoted to problems with strongly convex loss functions. If the reviewers claim that the scope of our work is narrow, the same criticism would apply to those past works targeting strongly convex loss functions.
Rademacher upper bounds for cross-validation errors with an application to the lasso
Xu, Ning, Fisher, Timothy C. G., Hong, Jian
We establish a general upper bound for $K$-fold cross-validation ($K$-CV) errors that can be adapted to many $K$-CV-based estimators and learning algorithms. Based on the Rademacher complexity of the model and the Orlicz-$\Psi_{\nu}$ norm of the error process, the CV error upper bound applies to both light-tail and heavy-tail error distributions. We also extend the CV error upper bound to $\beta$-mixing data using the technique of independent blocking. We provide a Python package (\texttt{CVbound}, \url{https://github.com/isaac2math}) for computing the CV error upper bound in $K$-CV-based algorithms. Using the lasso as an example, we demonstrate in simulations that the upper bounds are tight and stable across different parameter settings and random seeds. As well as accurately bounding the CV errors for the lasso, the minimizer of the new upper bounds can be used as a criterion for variable selection. Compared with the CV-error minimizer, simulations show that tuning the lasso penalty parameter according to the minimizer of the upper bound yields a sparser and more stable model that retains all of the relevant variables.
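The complexity measure underlying this bound can be illustrated with a short sketch (the `CVbound` package's actual API is not shown here, and all names below are assumptions): for the linear class $\{x \mapsto w \cdot x : \|w\|_2 \le B\}$, the supremum inside the empirical Rademacher complexity has a closed form, $(B/n)\,\|\sum_i \sigma_i x_i\|_2$, which a Monte-Carlo average over sign vectors can estimate.

```python
import numpy as np

def empirical_rademacher(X, B=1.0, n_draws=200, seed=0):
    """Monte-Carlo estimate of the empirical Rademacher complexity of the
    norm-ball linear class {x -> w.x : ||w||_2 <= B} on the sample X.
    The sup over the class is (B/n) * ||X^T sigma||_2 in closed form."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    vals = []
    for _ in range(n_draws):
        sigma = rng.choice([-1.0, 1.0], size=n)  # Rademacher signs
        vals.append(B / n * np.linalg.norm(sigma @ X))
    return float(np.mean(vals))

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 5))
print(f"Rademacher estimate: {empirical_rademacher(X):.3f}")
```

Note the estimate scales linearly in the norm bound $B$, consistent with the closed form; the paper's bound combines such a complexity term with Orlicz-norm control of the error process.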
- North America > United States > New York (0.04)
- Oceania > Australia > New South Wales > Sydney (0.04)
Regularization Path of Cross-Validation Error Lower Bounds
Shibagaki, Atsushi, Suzuki, Yoshiki, Karasuyama, Masayuki, Takeuchi, Ichiro
Papers published at the Neural Information Processing Systems Conference.
Weighted graphlets and deep neural networks for protein structure classification
Guo, Hongyu, Newaz, Khalique, Emrich, Scott, Milenkovic, Tijana, Li, Jun
As proteins with similar structures often have similar functions, analysis of protein structures can help predict protein functions and is thus important. We consider the problem of protein structure classification, which computationally classifies the structures of proteins into predefined groups. We develop a weighted network that depicts the protein structures, and more importantly, we propose the first graphlet-based measure that applies to weighted networks. Further, we develop a deep neural network (DNN) composed of both convolutional and recurrent layers to use this measure for classification. Put together, our approach shows dramatic improvements in performance over existing graphlet-based approaches on 36 real datasets. Even comparing with the state-of-the-art approach, it almost halves the classification error. In addition to protein structure networks, our weighted-graphlet measure and DNN classifier can potentially be applied to classification of other weighted networks in computational biology as well as in other domains. Proteins are the building molecules of life, and their diverse functions define the mechanisms of sophisticated organisms [1].
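As a purely illustrative stand-in for the idea of a graphlet measure on weighted networks (the authors' actual measure covers general graphlets and orbits, and everything below is an assumption of this sketch), one can score each triangle, the simplest non-trivial graphlet, by the geometric mean of its edge weights:

```python
import itertools
import numpy as np

def weighted_triangle_scores(W):
    """Score each triangle (3-node graphlet) present in the symmetric
    weighted adjacency matrix W by the geometric mean of its edge weights.
    Illustrative sketch only, not the paper's weighted-graphlet measure."""
    n = W.shape[0]
    scores = {}
    for i, j, k in itertools.combinations(range(n), 3):
        w1, w2, w3 = W[i, j], W[i, k], W[j, k]
        if w1 > 0 and w2 > 0 and w3 > 0:  # all three edges must exist
            scores[(i, j, k)] = (w1 * w2 * w3) ** (1 / 3)
    return scores

# toy network: one triangle among nodes 0,1,2; node 3 isolated
W = np.zeros((4, 4))
W[0, 1] = W[1, 0] = 1.0
W[0, 2] = W[2, 0] = 8.0
W[1, 2] = W[2, 1] = 27.0
print(weighted_triangle_scores(W))
```

A vector of such per-graphlet statistics could then serve as input features to a classifier, which is the role the paper's measure plays for its convolutional-recurrent DNN.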
- North America > United States > Indiana > St. Joseph County > Notre Dame (0.04)
- North America > United States > Tennessee > Knox County > Knoxville (0.04)
- Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Fusing heterogeneous data sets
In systems biology, it is common to measure biochemical entities at different levels of the same biological system. One of the central problems in the fusion of such data sets is the heterogeneity of the data. This thesis discusses two types of heterogeneity. The first is the type of data, such as metabolomics, proteomics and RNAseq data in genomics; these different omics data reflect the properties of the studied biological system from different perspectives. The second is the type of scale, which indicates measurements obtained on different scales, such as binary, ordinal, interval and ratio-scaled variables. In this thesis, we developed several statistical methods capable of fusing data sets with these two types of heterogeneity. The advantages of the proposed methods over other approaches are assessed using comprehensive simulations as well as the analysis of real biological data sets.
- Europe > Austria > Vienna (0.13)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Asia > China (0.04)
- (6 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.67)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Education (1.00)
Cross validation in sparse linear regression with piecewise continuous nonconvex penalties and its acceleration
Obuchi, Tomoyuki, Sakata, Ayaka
We investigate the signal reconstruction performance of sparse linear regression in the presence of noise when piecewise continuous nonconvex penalties are used. Among such penalties, we focus on the smoothly clipped absolute deviation (SCAD) penalty. The contributions of this study are three-fold: We first present a theoretical analysis of a typical reconstruction performance, using the replica method, under the assumption that each component of the design matrix is given as an independent and identically distributed (i.i.d.) Gaussian variable. This clarifies the superiority of the SCAD estimator compared with $\ell_1$ in a wide parameter range, although the nonconvex nature of the penalty tends to lead to solution multiplicity in certain regions. This multiplicity is shown to be connected to replica symmetry breaking in spin-glass theory, and the associated phase diagrams are given. We also show that the global minimum of the mean square error between the estimator and the true signal is located in the replica symmetric phase. Second, we develop an approximate formula that efficiently computes the cross-validation error without actually conducting the cross-validation, which is also applicable to non-i.i.d. design matrices. It is shown that this formula is only applicable in the unique-solution region and tends to be unstable in the multiple-solution region. We implement instability detection procedures, which allow the approximate formula to stand alone and consequently enable us to draw phase diagrams for any specific dataset. Third, we propose an annealing procedure, called nonconvexity annealing, to obtain the solution path efficiently. Numerical simulations on simulated datasets verify the consistency of the theoretical results and the efficiency of the approximate formula and nonconvexity annealing.
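The SCAD penalty the abstract centers on has a standard piecewise closed form (Fan & Li, 2001), which a short sketch can make concrete; the function name and vectorized shape are choices of this sketch, while the formula itself and the conventional default $a = 3.7$ are standard:

```python
import numpy as np

def scad(t, lam, a=3.7):
    """SCAD penalty, elementwise. Piecewise in |t|:
    L1-like near zero, a quadratic blend, then constant (no shrinkage bias)."""
    t = np.abs(np.asarray(t, dtype=float))
    inner = lam * t                                              # |t| <= lam
    middle = (2 * a * lam * t - t**2 - lam**2) / (2 * (a - 1))   # lam < |t| <= a*lam
    outer = lam**2 * (a + 1) / 2                                 # |t| > a*lam
    return np.where(t <= lam, inner, np.where(t <= a * lam, middle, outer))
```

The flat outer region is what makes the penalty nonconvex (and the overall objective piecewise continuous in the sense above): unlike $\ell_1$, large coefficients incur a constant penalty and are not shrunk, at the cost of the solution multiplicity analyzed in the paper.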
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
- North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (2 more...)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Cross Validation (0.80)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.72)
Logistic principal component analysis via non-convex singular value thresholding
Song, Yipeng, Westerhuis, Johan A., Smilde, Age K.
Multivariate binary data is becoming abundant in current biological research. Logistic principal component analysis (PCA) is one of the commonly used tools to explore the relationships inside a multivariate binary data set by exploiting the underlying low rank structure. We re-expressed the logistic PCA model based on the latent variable interpretation of the generalized linear model on binary data. The multivariate binary data set is assumed to be the sign observation of an unobserved quantitative data set, on which a low rank structure is assumed to exist. However, the standard logistic PCA model (using an exact low rank constraint) is prone to overfitting, which could lead to divergence of some estimated parameters towards infinity. We propose to fit a logistic PCA model through non-convex singular value thresholding to alleviate the overfitting issue. An efficient Majorization-Minimization algorithm is implemented to fit the model, and a missing-value-based cross validation (CV) procedure is introduced for the model selection. Our experiments on realistic simulations of imbalanced binary data with low signal-to-noise ratios show that the CV-error-based model selection procedure succeeds in selecting the proposed model. Furthermore, the selected model demonstrates superior performance in recovering the underlying low rank structure compared to models with a convex nuclear norm penalty and an exact low rank constraint. A binary copy number aberration data set is used to illustrate the proposed methodology in practice.
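The singular value thresholding step contrasted in this abstract can be sketched generically (the paper's specific non-convex thresholding operator is not reproduced here; this sketch uses hard thresholding as a simple non-convex stand-in next to the soft/nuclear-norm variant):

```python
import numpy as np

def svt(Z, tau, kind="soft"):
    """Threshold the singular values of Z.
    'soft' is the nuclear-norm proximal step (shrinks every singular value);
    'hard' is a simple non-convex variant (keeps large singular values intact)."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    if kind == "soft":
        s = np.maximum(s - tau, 0.0)
    else:
        s = np.where(s > tau, s, 0.0)
    return U @ (s[:, None] * Vt)
```

The contrast mirrors the abstract's comparison: the convex (soft) rule biases even the large, informative singular values downward, whereas a non-convex rule can zero out small ones while leaving large ones unshrunk, analogous to SCAD versus $\ell_1$ in the sparse regression setting.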
- Europe > Netherlands > South Holland > Leiden (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)